
CHAOS : A Parallelization Scheme for Training Convolutional Neural Networks on Intel Xeon Phi



Abstract

Deep learning is an important component of big-data analytic tools and intelligent applications such as self-driving cars, computer vision, speech recognition, and precision medicine. However, the training process is computationally intensive and often requires a large amount of time if performed sequentially. Modern parallel computing systems provide the capability to reduce the training time of deep neural networks. In this paper, we present our parallelization scheme for training convolutional neural networks (CNNs), named Controlled Hogwild with Arbitrary Order of Synchronization (CHAOS). Major features of CHAOS include support for thread and vector parallelism, non-instant updates of weight parameters during back-propagation without a significant delay, and implicit synchronization in arbitrary order. CHAOS is tailored for parallel computing systems accelerated with the Intel Xeon Phi. We evaluate our parallelization approach empirically using measurement techniques and performance modeling for various numbers of threads and CNN architectures. Experimental results for the MNIST dataset of handwritten digits, using the full number of threads on the Xeon Phi, show speedups of up to 103x compared to execution on a single Xeon Phi thread, 14x compared to sequential execution on an Intel Xeon E5, and 58x compared to sequential execution on an Intel Core i5.
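The abstract's central idea, worker threads writing weight updates into shared parameters without locks and synchronizing only implicitly, follows the general Hogwild approach. The sketch below is only an illustration of that idea under stated assumptions: a toy logistic-regression model stands in for a CNN, OpenMP provides the thread parallelism, and every identifier in it is hypothetical. It is not the authors' CHAOS implementation.

```cpp
// Minimal sketch of a Hogwild-style lock-free update with OpenMP.
// Toy logistic regression instead of a CNN; all names are hypothetical.
// Build (assumption): g++ -fopenmp -O2 hogwild_sketch.cpp -o hogwild_sketch
#include <cmath>
#include <cstdio>
#include <random>
#include <vector>
#include <omp.h>

int main() {
    const int NUM_FEATURES = 16;      // toy model size
    const int NUM_EXAMPLES = 100000;  // toy dataset size
    const float learning_rate = 0.01f;

    // Synthetic dataset: label is 1 when the feature sum is positive.
    std::mt19937 rng(42);
    std::uniform_real_distribution<float> dist(-1.0f, 1.0f);
    std::vector<float> x(NUM_EXAMPLES * NUM_FEATURES);
    std::vector<float> y(NUM_EXAMPLES);
    for (int i = 0; i < NUM_EXAMPLES; ++i) {
        float sum = 0.0f;
        for (int j = 0; j < NUM_FEATURES; ++j) {
            x[i * NUM_FEATURES + j] = dist(rng);
            sum += x[i * NUM_FEATURES + j];
        }
        y[i] = sum > 0.0f ? 1.0f : 0.0f;
    }

    // Shared weights, updated by all threads WITHOUT locks (Hogwild-style).
    // Occasional races and lost updates are tolerated rather than prevented.
    std::vector<float> w(NUM_FEATURES, 0.0f);

    #pragma omp parallel for schedule(static)
    for (int i = 0; i < NUM_EXAMPLES; ++i) {
        // Forward pass on one example.
        float z = 0.0f;
        for (int j = 0; j < NUM_FEATURES; ++j)
            z += w[j] * x[i * NUM_FEATURES + j];
        float p = 1.0f / (1.0f + std::exp(-z));

        // Backward pass: gradient is written straight into the shared
        // weights, with no mutex and no barrier between threads.
        float grad = p - y[i];
        for (int j = 0; j < NUM_FEATURES; ++j)
            w[j] -= learning_rate * grad * x[i * NUM_FEATURES + j];
    }

    printf("w[0] = %f (max threads: %d)\n", w[0], omp_get_max_threads());
    return 0;
}
```

The inner per-feature loops are simple enough for the compiler to auto-vectorize, which loosely mirrors the combination of thread and vector parallelism mentioned in the abstract; the delayed, unsynchronized writes mirror the "non-instant updates without a significant delay" feature.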
